feat(simulation): introduce deterministic simulation framework, scenario models, and validation pipeline v2.0.0 #5
Merged
…ild process
feat(env): enhance .env.example with new rate limiting and telemetry configurations
fix(makefile): correct build and swagger generation commands to use the proper main package
…g and deployment readiness checks
…r anomaly detection
…ication in SimulateAddService
feat(predictive): enhance Evaluator with additional latency metrics and network pressure detection
test(predictive): add tests for sustained traffic and latency spike scenarios in EvaluateFromSamples
…ackVerifiedAt, and bannerVerified
…trics, dashboard metrics source values, and graph summary for a given run
…each layer when available
…ric name, expected value, actual value)
…system confirmation source
…ack verification source
Add SimulationRequest schema with all 5 locked scenario types, snapshot reference fields, stable validation error codes, and 22 tests covering valid/invalid cases and determinism. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add SimulationResponse schema with required fields (version, scenarioType, snapshotTimestamp, evidenceSources, evidenceMode, confidenceLevel, assumptions, impactedServices, impactedPaths, beforeAfterValues, recommendation), degraded-mode label fields, and ValidateSimulationResponse with stable error codes and 28 tests. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add EvidenceSourceLabel type with four constants: live_service_graph, live_k8s_runtime, historical_influxdb, deterministic_fallback
- Add ResolveEvidenceMode following mandatory tier order: live graph -> live runtime -> Influx history -> deterministic fallback
- Add DetermineConfidenceLevel rubric: FULL->HIGH, PARTIAL->MEDIUM, DEGRADED/FALLBACK->LOW (no random weighting)
- Add ResolveEvidenceSources, EvidenceModeToTierDescription helpers
- Add evidence_test.go with 18 tests covering all paths and determinism

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
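The tier order and confidence rubric in this commit can be illustrated with a minimal Go sketch. The four source labels and the FULL/PARTIAL/DEGRADED/FALLBACK to HIGH/MEDIUM/LOW rubric come from the commit message; the `Availability` struct, the `...Sketch` function names, and in particular the rule that decides which mode each combination of available tiers maps to are assumptions made for illustration, not the PR's actual implementation.

```go
package main

// EvidenceSourceLabel names one evidence tier. The four string values are
// the ones listed in the commit message.
type EvidenceSourceLabel string

const (
	SourceLiveServiceGraph      EvidenceSourceLabel = "live_service_graph"
	SourceLiveK8sRuntime        EvidenceSourceLabel = "live_k8s_runtime"
	SourceHistoricalInfluxDB    EvidenceSourceLabel = "historical_influxdb"
	SourceDeterministicFallback EvidenceSourceLabel = "deterministic_fallback"
)

// EvidenceMode values are taken from the rubric in the commit message.
type EvidenceMode string

const (
	ModeFull     EvidenceMode = "FULL"
	ModePartial  EvidenceMode = "PARTIAL"
	ModeDegraded EvidenceMode = "DEGRADED"
	ModeFallback EvidenceMode = "FALLBACK"
)

// Availability is a hypothetical input type: one flag per live/historical tier.
type Availability struct {
	Graph, Runtime, Influx bool
}

// resolveEvidenceSourcesSketch walks the tiers in the mandatory order
// (graph -> runtime -> Influx -> fallback) and lists the ones that are usable.
func resolveEvidenceSourcesSketch(a Availability) []EvidenceSourceLabel {
	var out []EvidenceSourceLabel
	if a.Graph {
		out = append(out, SourceLiveServiceGraph)
	}
	if a.Runtime {
		out = append(out, SourceLiveK8sRuntime)
	}
	if a.Influx {
		out = append(out, SourceHistoricalInfluxDB)
	}
	if len(out) == 0 {
		out = append(out, SourceDeterministicFallback)
	}
	return out
}

// resolveEvidenceModeSketch maps availability to a mode.
// ASSUMPTION: this particular mapping is illustrative only.
func resolveEvidenceModeSketch(a Availability) EvidenceMode {
	switch {
	case a.Graph && a.Runtime && a.Influx:
		return ModeFull
	case a.Graph && a.Runtime:
		return ModePartial
	case a.Graph || a.Runtime || a.Influx:
		return ModeDegraded
	default:
		return ModeFallback
	}
}

// determineConfidenceLevelSketch applies the fixed rubric from the commit
// message; there is no randomness, so equal inputs give equal outputs.
func determineConfidenceLevelSketch(m EvidenceMode) string {
	switch m {
	case ModeFull:
		return "HIGH"
	case ModePartial:
		return "MEDIUM"
	default: // DEGRADED and FALLBACK
		return "LOW"
	}
}
```

Because every step is a pure function of the availability flags, repeated calls with the same input return the same mode, sources, and confidence level, which is the property the determinism tests are described as checking.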
Add snapshot.go with ComposeSnapshot/ComposeSnapshotAt that capture service graph and live Kubernetes runtime truth into a SHA-256-hashed, immutable SimulationSnapshot. All slices are deep-copied and canonically sorted before hashing so identical inputs always produce the same hash regardless of input order or call time. Add snapshot_test.go with 14 tests covering determinism, immutability, order-independence, pointer deep-copy, and hash format. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
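The copy, sort, then hash approach described above might look roughly like the following Go sketch. The `Edge` and `Snapshot` types, their fields, and `composeSnapshotSketch` are hypothetical simplifications; the real SimulationSnapshot likely carries more data, and its exact canonical encoding is not shown in the commit message.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"sort"
)

// Edge is a hypothetical, simplified service-graph edge.
type Edge struct {
	From string `json:"from"`
	To   string `json:"to"`
}

// Snapshot is a hypothetical, simplified stand-in for SimulationSnapshot.
type Snapshot struct {
	Services []string
	Edges    []Edge
	Hash     string // lowercase hex SHA-256 of the canonical payload
}

// composeSnapshotSketch copies its inputs, sorts the copies into a canonical
// order, and hashes the JSON encoding of the sorted data. The caller's slices
// are never modified, and no timestamp enters the hash.
func composeSnapshotSketch(services []string, edges []Edge) (Snapshot, error) {
	s := append([]string(nil), services...)
	e := append([]Edge(nil), edges...)

	sort.Strings(s)
	sort.Slice(e, func(i, j int) bool {
		if e[i].From != e[j].From {
			return e[i].From < e[j].From
		}
		return e[i].To < e[j].To
	})

	// Struct fields are encoded in declaration order, so the payload bytes
	// depend only on the sorted contents.
	payload, err := json.Marshal(struct {
		Services []string `json:"services"`
		Edges    []Edge   `json:"edges"`
	}{s, e})
	if err != nil {
		return Snapshot{}, err
	}

	sum := sha256.Sum256(payload)
	return Snapshot{Services: s, Edges: e, Hash: hex.EncodeToString(sum[:])}, nil
}
```

Calling `composeSnapshotSketch` twice with the same services and edges in a different order yields the same 64-character hex hash, which mirrors the order-independence property the snapshot tests are said to cover.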
Add EvidenceResolverInput, InfluxCheckResult, EvidenceResolverResult, ResolveEvidenceTiers, and ResolveEvidenceTiersFromSnapshot. Resolver follows mandatory tier order (graph->runtime->Influx->fallback), degrades gracefully on Influx unavailability/sparse/error, and never blocks simulation. 22 tests covering all degraded modes, determinism, and snapshot-based resolution. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add execution_core.go with BuildExecutionContext, BuildBaseResponse, NormalizeResponse, CanonicalizeResponse, and stable sort helpers (SortImpactedServices, SortImpactedPaths, SortBeforeAfterValues, SortAssumptions). Add execution_core_test.go with 20 tests verifying determinism, stable sorting, and byte-equal canonical JSON for identical inputs. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
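To illustrate how stable sorting leads to byte-equal canonical JSON, here is a small Go sketch. The `Response` and `BeforeAfterValue` types and `canonicalizeSketch` are hypothetical simplifications and not the PR's CanonicalizeResponse; the sort keys in particular are assumed.

```go
package main

import (
	"encoding/json"
	"sort"
)

// BeforeAfterValue is a hypothetical, simplified before/after entry.
type BeforeAfterValue struct {
	Metric string  `json:"metric"`
	Before float64 `json:"before"`
	After  float64 `json:"after"`
}

// Response is a hypothetical, simplified simulation response.
type Response struct {
	ImpactedServices  []string           `json:"impactedServices"`
	Assumptions       []string           `json:"assumptions"`
	BeforeAfterValues []BeforeAfterValue `json:"beforeAfterValues"`
}

// canonicalizeSketch sorts copies of every slice and then marshals the result.
// encoding/json writes struct fields in declaration order, so two responses
// with the same contents produce identical bytes regardless of input order.
func canonicalizeSketch(r Response) ([]byte, error) {
	c := Response{
		ImpactedServices:  append([]string(nil), r.ImpactedServices...),
		Assumptions:       append([]string(nil), r.Assumptions...),
		BeforeAfterValues: append([]BeforeAfterValue(nil), r.BeforeAfterValues...),
	}
	sort.Strings(c.ImpactedServices)
	sort.Strings(c.Assumptions)
	// SliceStable keeps the relative order of entries that share a metric name.
	sort.SliceStable(c.BeforeAfterValues, func(i, j int) bool {
		return c.BeforeAfterValues[i].Metric < c.BeforeAfterValues[j].Metric
	})
	return json.Marshal(c)
}
```

A determinism test in this style would build the same response with its slices in two different orders, canonicalize both, and compare the outputs with `bytes.Equal`.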
Add RunFailureShutdownScenario using the immutable SimulationSnapshot to compute blast radius, impacted services/paths, deterministic before/after estimates, declared assumptions, and recommendation tied to evidence fields. Returns DEFERRED when target is absent from snapshot rather than guessing values. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…US-010) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds RunNetworkCutScenario with deterministic full-cut and partial-degradation models. Each matched snapshot edge produces before/after BAVs for RPS, error rate, and latency (P95). Missing links return DEFERRED; partial link match proceeds with a note. Includes 15 tests covering DEFERRED, full cut, partial degradation, multi-link, determinism, evidence fields, and field-ref format. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add scaling_vm_validation_test.go with 5 reproducible VM test cases covering scale-up (5→10 pods, approve_scale_up), caution scale-down (5→3 pods, caution_scale_down), determinism, degraded-mode without Influx, and a structured pass/fail validation report artifact. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Added traffic_spike_vm_validation_test.go with 5 test functions covering:
- Moderate spike (2×): full outcome assertions (roles, path sigs, incoming_rps, latency_p95_ms BAVs, monitor_and_prepare_rate_limits recommendation)
- High-severity spike (4×): pre_emptive_scale_up_required recommendation and BAVs
- Determinism: byte-equivalent canonical JSON across repeated identical runs
- Degraded-mode without Influx: OK status with non-none degraded mode label
- Structured validation report logged to test output as evidence artifact

All 5 tests pass; go build ./... and go test ./pkg/simulation/... both clean.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…n real VMs Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add network_cut_vm_validation_test.go with 4 reproducible VM test cases: full cut (RPS→0, error→1.0, latency omitted, failover recommendation), 30% partial degradation (RPS 200→140, error 0.307, latency 45→58.5ms, traffic-shaping recommendation), determinism check, and degraded-mode without InfluxDB. All criteria pass; `go test ./pkg/simulation/...` clean. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
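The 30% partial-degradation figures quoted above (RPS 200→140, error 0.307, latency 45→58.5ms) can be reproduced with a simple closed-form model, sketched below in Go. This is a guess that happens to match the numbers, not a confirmed description of RunNetworkCutScenario: the formulas, the `LinkMetrics` type, the `degradeLinkSketch` name, and the baseline error rate of 0.01 used in the usage note are all assumptions that do not appear in the source. The full-cut case, where latency is omitted, is not modeled here.

```go
package main

// LinkMetrics is a hypothetical container for the per-edge values the
// commit message reports: RPS, error rate, and P95 latency in milliseconds.
type LinkMetrics struct {
	RPS          float64
	ErrorRate    float64
	LatencyP95Ms float64
}

// degradeLinkSketch applies a degradation fraction d (0 < d < 1) to one link.
// ASSUMED model, chosen only because it reproduces the quoted figures:
//   - throughput drops by the fraction d
//   - a request succeeds only if it survives both the baseline error rate
//     and the degradation, so errors combine as 1 - (1-base)*(1-d)
//   - latency grows by the fraction d
func degradeLinkSketch(before LinkMetrics, d float64) LinkMetrics {
	return LinkMetrics{
		RPS:          before.RPS * (1 - d),
		ErrorRate:    1 - (1-before.ErrorRate)*(1-d),
		LatencyP95Ms: before.LatencyP95Ms * (1 + d),
	}
}
```

Under these assumptions, `degradeLinkSketch(LinkMetrics{RPS: 200, ErrorRate: 0.01, LatencyP95Ms: 45}, 0.30)` gives approximately 140 RPS, an error rate of 0.307, and 58.5 ms latency, up to floating-point rounding.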
- Add e2e_degraded_traceability_test.go with 7 test functions covering all 3 ACs
- AC-1: verify DegradedMode label and EvidenceMode returned for empty/sparse InfluxDB
- AC-2: log 27-field UI→contract traceability checklist; assert all 23 required fields populated
- AC-3: confirm unknown scenarios rejected, fallback-only evidence deferred with no guessed values, EnforceDeferredConstraints strips synthetic output
- All 5 supported scenarios verified runnable in degraded mode without blocking
- go test ./pkg/simulation/... and go build ./... pass

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Introduced a new test file `add_test.go` with comprehensive tests for the SimulateAddService function, covering various scenarios including node feasibility, dependency risks, and shared host resource configurations.
- Enhanced the AddSimulationRequest struct to include TargetNodeName and improved the AddSimulationResult struct to provide more detailed results, including selected node information and aggregate resources.
- Added new types for dependency analysis and risk analysis to better encapsulate the results of service simulations.
This PR introduces the core simulation framework and supporting infrastructure used for scenario execution, validation, and operational testing of VM environments.